Import data

Univariate EDA

Predictor variables

## year
## 1982 1983 1984 1985 1986 1987 1988 
##   48   48   48   48   48   48   48

Question: do we want to round down the drink age? Then make this categorical?

## drinkage
##               18             18.5               19 19.1599998474121 
##               12                3               49                1 
##            19.25             19.5 19.6700000762939               20 
##                2                2                6               20 
##            20.25 20.3299999237061             20.5               21 
##                1                2                6              232

Response variables

Multivariate EDA

mean response plot:

jail and service have shown slight increase by year.
drinkage have mainly shifted from 18-19 in 1982 to 21 in 1988.

##      year
## jail  1982 1983 1984 1985 1986 1987 1988
##   no    39   35   34   33   33   34   33
##   yes    9   13   14   15   15   14   14
##        year
## service 1982 1983 1984 1985 1986 1987 1988
##     no    43   40   39   38   38   38   37
##     yes    5    8    9   10   10   10   10
##             year
## drinkage_cat 1982 1983 1984 1985 1986 1987 1988
##           18    5    4    3    2    1    0    0
##           19   13   13   13   11    8    1    1
##           20    6    6    6    5    3    3    0
##           21   24   25   26   30   36   44   47

dont think unemployment rate is a good predictor, as it doesnt fit our goals from this analysis. can add into the discussion sections tho

percentage of young drivers have decreased over the years as the number of states that changed their drinkage to 21 has increased. however, later on (if i use a mixed effects model) drinkage is not exactly a good predictor (not significant).

Model fitting

Note: goal is to observe whether the type of law implemented to tacker drunk driving can actually prevent fatalities

## Oneway (individual) effect Within Model
## 
## Call:
## plm(formula = afatal_prop ~ beertax + jail + service + drinkage, 
##     data = Fatalities, model = "within", index = "state")
## 
## Unbalanced Panel: n = 48, T = 6-7, N = 335
## 
## Residuals:
##       Min.    1st Qu.     Median    3rd Qu.       Max. 
## -0.3434894 -0.0256639 -0.0016049  0.0204173  0.2303012 
## 
## Coefficients:
##              Estimate Std. Error t-value  Pr(>|t|)    
## beertax     0.0581056  0.0527138  1.1023 0.2712746    
## jailyes     0.1431230  0.0398392  3.5925 0.0003861 ***
## serviceyes -0.1659102  0.0454909 -3.6471 0.0003156 ***
## drinkage   -0.0179305  0.0048585 -3.6906 0.0002683 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Total Sum of Squares:    0.85673
## Residual Sum of Squares: 0.76982
## R-Squared:      0.10145
## Adj. R-Squared: -0.060485
## F-statistic: 7.98754 on 4 and 283 DF, p-value: 4.1001e-06
## Linear mixed model fit by REML ['lmerMod']
## Formula: afatal_prop ~ beertax + jail + service + drinkage_cat + (1 |  
##     state)
## 
## REML criterion at convergence: -870.8
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -6.1340 -0.5009 -0.0677  0.3450  4.3716 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev.
##  state    (Intercept) 0.003654 0.06044 
##  Residual             0.002775 0.05268 
## Number of obs: 335, groups:  state, 48
## 
## Fixed effects:
##                 Estimate Std. Error t value
## (Intercept)     0.323416   0.023070  14.019
## beertax         0.028628   0.018412   1.555
## jailyes         0.051174   0.020429   2.505
## serviceyes     -0.060080   0.023645  -2.541
## drinkage_cat19  0.004721   0.020154   0.234
## drinkage_cat20 -0.003560   0.021885  -0.163
## drinkage_cat21 -0.029759   0.019295  -1.542
## 
## Correlation of Fixed Effects:
##             (Intr) beertx jailys srvcys drn_19 drn_20
## beertax     -0.455                                   
## jailyes     -0.165  0.105                            
## serviceyes   0.074 -0.140 -0.707                     
## drinkg_ct19 -0.714  0.021 -0.017  0.006              
## drinkg_ct20 -0.684  0.053  0.006 -0.021  0.741       
## drinkg_ct21 -0.798  0.065  0.018 -0.052  0.852  0.809
## Analysis of Variance Table
##              npar   Sum Sq   Mean Sq F value
## beertax         1 0.005926 0.0059257  2.1353
## jail            1 0.002295 0.0022946  0.8268
## service         1 0.024149 0.0241490  8.7021
## drinkage_cat    3 0.036160 0.0120535  4.3435

if i use year as random effect, it seems that drink age does not really correlate to alcohol fatalities.
if i use state as random effect, jail is not a significant predictor of alcohol fatalities

## 
## Call:
## lm(formula = afatal_prop ~ beertax + jail + service + drinkage_cat)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -0.15167 -0.04933 -0.01266  0.03183  0.31359 
## 
## Coefficients:
##                 Estimate Std. Error t value Pr(>|t|)    
## (Intercept)     0.333335   0.021241  15.693  < 2e-16 ***
## beertax         0.021209   0.009058   2.341  0.01981 *  
## jailyes         0.029612   0.011361   2.606  0.00957 ** 
## serviceyes     -0.033531   0.013259  -2.529  0.01191 *  
## drinkage_cat19 -0.016357   0.022627  -0.723  0.47026    
## drinkage_cat20 -0.013554   0.024907  -0.544  0.58668    
## drinkage_cat21 -0.030126   0.020851  -1.445  0.14947    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.07785 on 328 degrees of freedom
##   (1 observation deleted due to missingness)
## Multiple R-squared:  0.0539, Adjusted R-squared:  0.03659 
## F-statistic: 3.114 on 6 and 328 DF,  p-value: 0.005561
## Analysis of Variance Table
## 
## Response: afatal_prop
##               Df  Sum Sq  Mean Sq F value   Pr(>F)   
## beertax        1 0.02485 0.024846  4.0999 0.043697 * 
## jail           1 0.01760 0.017596  2.9035 0.089333 . 
## service        1 0.04768 0.047679  7.8676 0.005333 **
## drinkage_cat   3 0.02312 0.007708  1.2719 0.283966   
## Residuals    328 1.98771 0.006060                    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Trying out with transformation of variables

seems like it’s slightly better, but its way harder to interpret the findings now

## Oneway (individual) effect Within Model
## 
## Call:
## plm(formula = inv_afatal_prop ~ beertax + jail + service + drinkage, 
##     data = Fatalities, model = "within", index = "state")
## 
## Unbalanced Panel: n = 48, T = 6-7, N = 335
## 
## Residuals:
##       Min.    1st Qu.     Median    3rd Qu.       Max. 
## -1.7839032 -0.2311134 -0.0084016  0.2317763  3.3822809 
## 
## Coefficients:
##             Estimate Std. Error t-value  Pr(>|t|)    
## beertax    -0.560960   0.473780 -1.1840 0.2374022    
## jailyes    -0.769675   0.358065 -2.1495 0.0324394 *  
## serviceyes  1.061066   0.408862  2.5952 0.0099473 ** 
## drinkage    0.167551   0.043667  3.8370 0.0001536 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Total Sum of Squares:    68.028
## Residual Sum of Squares: 62.186
## R-Squared:      0.085881
## Adj. R-Squared: -0.078854
## F-statistic: 6.64693 on 4 and 283 DF, p-value: 3.9891e-05
## Linear mixed model fit by REML ['lmerMod']
## Formula: inv_afatal_prop ~ beertax + jail + service + drinkage_cat + (1 |  
##     state)
##    Data: Fatalities
## 
## REML criterion at convergence: 564
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -3.7833 -0.5384 -0.0229  0.4705  6.9891 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev.
##  state    (Intercept) 0.2923   0.5406  
##  Residual             0.2201   0.4691  
## Number of obs: 335, groups:  state, 48
## 
## Fixed effects:
##                Estimate Std. Error t value
## (Intercept)     3.22168    0.20577  15.657
## beertax        -0.24334    0.16453  -1.479
## jailyes        -0.35388    0.18244  -1.940
## serviceyes      0.50986    0.21115   2.415
## drinkage_cat19 -0.07405    0.17954  -0.412
## drinkage_cat20  0.01671    0.19496   0.086
## drinkage_cat21  0.28737    0.17190   1.672
## 
## Correlation of Fixed Effects:
##             (Intr) beertx jailys srvcys drn_19 drn_20
## beertax     -0.455                                   
## jailyes     -0.165  0.105                            
## serviceyes   0.075 -0.140 -0.708                     
## drinkg_ct19 -0.713  0.021 -0.017  0.005              
## drinkg_ct20 -0.683  0.053  0.006 -0.021  0.741       
## drinkg_ct21 -0.797  0.065  0.018 -0.052  0.852  0.809
## Analysis of Variance Table
##              npar Sum Sq Mean Sq F value
## beertax         1 0.4591 0.45910  2.0861
## jail            1 0.0108 0.01075  0.0489
## service         1 1.8334 1.83345  8.3309
## drinkage_cat    3 3.8888 1.29628  5.8901
## 
## Call:
## lm(formula = inv_afatal_prop ~ beertax + jail + service + drinkage_cat, 
##     data = Fatalities)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -1.7803 -0.4054 -0.0128  0.4130  2.4503 
## 
## Coefficients:
##                Estimate Std. Error t value Pr(>|t|)    
## (Intercept)     3.03998    0.19093  15.922  < 2e-16 ***
## beertax        -0.18028    0.08142  -2.214  0.02750 *  
## jailyes        -0.28295    0.10212  -2.771  0.00591 ** 
## serviceyes      0.33929    0.11918   2.847  0.00469 ** 
## drinkage_cat19  0.20729    0.20338   1.019  0.30885    
## drinkage_cat20  0.18321    0.22388   0.818  0.41376    
## drinkage_cat21  0.42582    0.18742   2.272  0.02373 *  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.6997 on 328 degrees of freedom
##   (1 observation deleted due to missingness)
## Multiple R-squared:  0.07924,    Adjusted R-squared:  0.06239 
## F-statistic: 4.704 on 6 and 328 DF,  p-value: 0.0001305
## Analysis of Variance Table
## 
## Response: inv_afatal_prop
##               Df  Sum Sq Mean Sq F value   Pr(>F)   
## beertax        1   1.795  1.7949  3.6658 0.056410 . 
## jail           1   1.632  1.6322  3.3335 0.068789 . 
## service        1   5.346  5.3457 10.9180 0.001058 **
## drinkage_cat   3   5.047  1.6824  3.4361 0.017215 * 
## Residuals    328 160.597  0.4896                    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1